Abstract. We develop a combinatorial approach to the study of semigroups
and monoids with finite presentations satisfying small overlap
conditions. In contrast to existing geometric methods, our approach
facilitates a sequential left-right analysis of words which lends itself to the
development of practical, efficient computational algorithms. In
particular, we obtain a highly practical linear time solution to the word
problem for monoids and semigroups with finite presentations satisfying
the condition C(4), and a polynomial time solution to the uniform word
problem for presentations satisfying the same condition.

Small overlap conditions are simple and natural combinatorial conditions 
on semigroup and monoid presentations, which serve to limit the complexity 
of derivation sequences between equivalent words in the generators. They 
form a natural semigroup-theoretic analogue of the small cancellation
conditions which are extensively used in combinatorial and computational
group theory [5]. It is well known that every group admitting a finite
presentation satisfying suitable small cancellation conditions is word
hyperbolic in the sense of Gromov [2], and in particular has word problem
solvable in linear time.

In the 1970s, Remmers [6, 7] developed an elegant geometric theory of
small overlap semigroups, using the natural semigroup-theoretic analogue
of the van Kampen diagrams extensively employed in combinatorial group
theory (see for example [5]). He applied his methods to show that
semigroups satisfying sufficiently small overlap conditions have what would now
be called linear Dehn function, that is, that the minimum length of a
derivation sequence between any two equivalent words is bounded above by a linear
function of the word lengths. In theory, it follows immediately that one can 
test if two words in the generators for such a semigroup are equivalent, by ex- 
haustively searching the (finite) space of all applicable derivation sequences 
of the given length, to see if any of them transforms one word to the other. 
However, the number of possible derivation sequences, and hence the time 
complexity of this algorithm, is exponential in the word length. More sophis- 
ticated techniques (such as applications of graph reachability algorithms) are 
of course applicable, but the problem remains one of searching a space of 
exponential size, and so we cannot really hope that this approach will lead 
to a tractable solution for the word problem. The question naturally arises, 
then, of how hard the word problem really is in these semigroups. 
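
The exhaustive search just described can be sketched as follows. This is only an illustration of the naive method, not of the algorithms developed in this paper; the function name `equivalent_by_search`, the representation of words as Python strings, and the example presentation are all assumptions made for the sketch (the example presentation is not claimed to satisfy any small overlap condition).

```python
from collections import deque


def equivalent_by_search(u, v, relations, max_steps):
    """Breadth-first search over derivation sequences of length <= max_steps.

    `relations` is a list of (lhs, rhs) pairs of strings; each relation may be
    applied in either direction at any position.
    """
    seen = {u}
    frontier = deque([(u, 0)])
    while frontier:
        word, steps = frontier.popleft()
        if word == v:
            return True
        if steps == max_steps:
            continue  # bound reached: do not expand this word further
        for lhs, rhs in relations:
            for old, new in ((lhs, rhs), (rhs, lhs)):
                start = word.find(old)
                while start != -1:
                    candidate = word[:start] + new + word[start + len(old):]
                    if candidate not in seen:
                        seen.add(candidate)
                        frontier.append((candidate, steps + 1))
                    start = word.find(old, start + 1)
    return False


# Hypothetical example: one relation ab = ba.
relations = [("ab", "ba")]
found = equivalent_by_search("aab", "baa", relations, max_steps=2)
```

The set `seen` may grow exponentially with the step bound, which is exactly the obstacle discussed above.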

In this paper, we develop a new approach to the study of this important
class of semigroups and monoids, along purely combinatorial lines. While
our work lacks some of the mathematical elegance of Remmers' approach
(indeed our foundational results are of a rather technical nature and our
proofs mainly by case analysis), it has the advantage of permitting a
sequential (left-right) analysis of elements, which for computational purposes
seems more relevant than a geometric viewpoint. Two computational con- 
sequences of the theory we develop are of particular interest. The first is 
a linear time (on a two-tape Turing machine) algorithm to solve the word 
problem in any semigroup with a presentation satisfying Remmers' condition 
C(4). The second is a polynomial time (more precisely, in the RAM model, 
quadratic in the presentation length and linear in the word length) solution 
to the uniform word problem for presentations satisfying the same condi- 
tion. While the proofs of correctness and of the time complexity bounds for 
these algorithms are rather technical, the algorithms themselves are quite 
straightforward to describe and eminently suitable for practical implemen- 
tation; the author is currently working on an implementation for the GAP 
computer algebra system pQ. 

In addition to this introduction, this paper comprises five sections. In
Section 1 we briefly recall the definitions of small overlap semigroups and
monoids, together with some of their properties, and introduce some
notation and terminology which will be used in the rest of the paper. Section 2
establishes some technical, but nonetheless important, combinatorial
properties of small overlap monoids, which are then used in Section 3 to give a
sequential characterisation of equivalence for two words in the generators of
a C(4) presentation. Section 4 shows how this characterisation can be used
to develop a linear time algorithm for the solution of the word problem of
a fixed small overlap presentation. Finally, in Section 5 we apply our
techniques to the solution of the uniform word problem for C(4) presentations;
we also observe that one can test efficiently whether an arbitrary presentation
satisfies the condition C(4).

The relationship of this work to the geometric approach developed by 
Remmers [6] perhaps deserves a further comment. As already mentioned,
our approach to small overlap semigroups is entirely combinatorial and, in 
its finished state, makes no direct use of Remmers' geometric machinery. 
However, the author would most likely never have arrived at this viewpoint 
without the insight and intuition afforded by Remmers' approach, and the 
reader interested in fully understanding the present paper may find it helpful 
to study also Remmers' work in parallel. Some of his results have been 
given a very accessible treatment by Higgins [3], but unfortunately the only
complete source still seems to be his thesis [6]. 



1. Preliminaries 

We assume familiarity with basic notions of combinatorial semigroup
theory, including free semigroups and monoids, and semigroup and monoid
presentations. In all but Section 5 of the paper, which is devoted to uniform
decision problems, we assume we have a fixed finite presentation for a
monoid (or semigroup; we shall see shortly that the difference is
unimportant). Words are assumed to be drawn from the free monoid on the
generating alphabet unless otherwise stated. We write $u = v$ to indicate
that two words are equal in the free monoid, and $u \equiv v$ to indicate that
they represent the same element of the semigroup presented. We say that
a word p is a possible prefix of u if there exists a (possibly empty) word w
with $pw \equiv u$, that is, if the element represented by u lies in the right ideal
generated by the element represented by p. The empty word is denoted e.

A relation word is a word which occurs as one side of a relation in the 
presentation. A piece is a word in the generators which occurs as a factor in 
sides of two different relations, or as a factor of both sides of a relation, or in 
two different (possibly overlapping) places within one side of a relation. To 
ensure a uniform treatment for free semigroups and monoids, we make the 
convention that the empty word e is always a piece, even if the presentation 
has no relations. 
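
The definition of a piece can be made concrete with a brute-force sketch. It assumes a presentation is given as a list of (lhs, rhs) pairs of Python strings; the function name `pieces` is hypothetical, and no attempt is made at efficiency. A factor counts as a piece when it has at least two distinct occurrences, where an occurrence is identified by the relation, the side of the relation, and the starting position.

```python
from collections import defaultdict


def pieces(relations):
    """Return the set of pieces of a presentation given as (lhs, rhs) pairs."""
    occurrences = defaultdict(set)
    for index, relation in enumerate(relations):
        for side, word in enumerate(relation):
            for start in range(len(word)):
                for end in range(start + 1, len(word) + 1):
                    # Record where this non-empty factor occurs.
                    occurrences[word[start:end]].add((index, side, start))
    found = {factor for factor, places in occurrences.items() if len(places) >= 2}
    found.add("")  # convention: the empty word is always a piece
    return found


# Hypothetical example: "a" occurs in two places within the side "aba".
example = pieces([("aba", "cd")])
```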

The presentation is said to satisfy the condition C(n), where n is a positive
integer, if no relation word can be written as the product of strictly fewer
than n pieces. Thus for each n, C(n + 1) is a strictly stronger condition
than C(n). We briefly mention another related condition. The presentation
satisfies the condition OL(x), where $0 < x < 1$, if whenever a piece p occurs
as a factor of a relation word R we have $|p| < x|R|$. Notice that if n is a
positive integer, then a semigroup satisfying OL(1/n) will certainly satisfy
C(n + 1).
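
As a rough illustration of the condition C(n), the following sketch computes the least number of pieces into which a word factorises, by dynamic programming over prefixes, and uses it to test C(n). The names `min_piece_factors` and `satisfies_c` are hypothetical, the set of pieces is assumed to be supplied by the caller, and this naive test is not claimed to be the efficient one discussed in Section 5.

```python
def min_piece_factors(word, piece_set):
    """Least k such that word is a product of k pieces, or None if impossible."""
    best = [None] * (len(word) + 1)
    best[0] = 0  # the empty word is a product of zero pieces
    for end in range(1, len(word) + 1):
        for start in range(end):
            if best[start] is not None and word[start:end] in piece_set:
                candidate = best[start] + 1
                if best[end] is None or candidate < best[end]:
                    best[end] = candidate
    return best[len(word)]


def satisfies_c(n, relation_words, piece_set):
    """True if no relation word is a product of strictly fewer than n pieces."""
    for word in relation_words:
        k = min_piece_factors(word, piece_set)
        if k is not None and k < n:
            return False
    return True


# Hypothetical data: with pieces "", "a" and "b", the word "ab" needs two pieces.
example_pieces = {"", "a", "b"}
needs = min_piece_factors("ab", example_pieces)
```

Note that an empty relation word is a product of zero pieces, so this test rejects it already for n = 1.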

The weakest meaningful small overlap condition, C(1), says that no relation
word is a product of zero pieces, that is, that e is not a relation word.
From this we see that in a small overlap monoid presentation, no non-empty 
word can be equivalent to the empty word, that is, no non-empty word can 
represent the identity. It follows that every small overlap monoid presenta- 
tion is also interpretable as a semigroup presentation, and that the monoid 
presented is isomorphic to the semigroup presented with an adjoined identity 
element. For simplicity in what follows we shall focus upon small overlap 
monoids, but from each of our results one can immediately deduce a corre- 
sponding result for small overlap semigroups. 

For each relation word R, let $X_R$ and $Z_R$ denote respectively the longest
prefix of R which is a piece, and the longest suffix of R which is a piece. If
the presentation satisfies C(3) then R cannot be written as a product of two
pieces, so this prefix and suffix cannot meet; thus, R admits a factorisation
$X_RY_RZ_R$ for some non-empty word $Y_R$. If moreover the presentation satisfies
the stronger condition C(4) then R cannot be written as a product of three
pieces, so $Y_R$ is not a piece. The converse also holds: a C(3) presentation
such that no $Y_R$ is a piece is a C(4) presentation. We call $X_R$, $Y_R$ and
$Z_R$ the maximal piece prefix, the middle word and the maximal piece suffix
respectively of R.
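
The factorisation of a relation word into maximal piece prefix, middle word and maximal piece suffix can be sketched as below. The function name `factorise` is hypothetical, words are assumed to be Python strings, and the set of pieces (including the empty word) is assumed to be supplied by the caller.

```python
def factorise(relation_word, piece_set):
    """Split R as (X, Y, Z): maximal piece prefix, middle word, maximal piece suffix."""
    n = len(relation_word)
    # Longest prefix and longest suffix of R that are pieces.
    x_len = max((k for k in range(n + 1) if relation_word[:k] in piece_set), default=0)
    z_len = max((k for k in range(n + 1) if relation_word[n - k:] in piece_set), default=0)
    if x_len + z_len >= n:
        # R would be a product of at most two pieces, so C(3) fails.
        raise ValueError("maximal piece prefix and suffix meet: not C(3)")
    return (relation_word[:x_len],
            relation_word[x_len:n - z_len],
            relation_word[n - z_len:])


# Hypothetical example: in "abcxyab" the factors "a", "b" and "ab" occur twice.
x, y, z = factorise("abcxyab", {"", "a", "b", "ab"})
```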

Assuming now that the presentation satisfies at least the condition C(3),
we shall use the letters X, Y and Z (sometimes with adornments or
subscripts) exclusively to represent maximal piece prefixes, middle words and
maximal piece suffixes respectively of relation words; two such letters with
the same subscript or adornment (or with none) will be assumed to stand
for the appropriate factors of the same relation word.

If R is a relation word we write $\overline{R}$ for the (necessarily unique, as a result of
the small overlap condition) word such that $(R, \overline{R})$ or $(\overline{R}, R)$ is a relation in
the presentation. We write $\overline{X_R}$, $\overline{Y_R}$ and $\overline{Z_R}$ for $X_{\overline{R}}$, $Y_{\overline{R}}$ and $Z_{\overline{R}}$ respectively.
(This is an abuse of notation since, for example, the word $X_R$ may be a
maximal piece prefix of two distinct relation words, but we shall be careful
to ensure that the meaning is clear from the context.)

2. Weak Cancellation Properties 

To perform efficient computations with words, it is very helpful to be 
able to process them in a sequential, left-right manner. To facilitate this 
in the case of the word problem for small overlap monoids, we need to 
know what can be deduced about the equivalence (or non-equivalence) of 
two words from prefixes of those words. This section develops a theory 
with this end in mind, including a number of results which can be viewed 
as weak cancellativity conditions satisfied by small overlap monoids. We 
assume throughout a fixed monoid presentation satisfying the small overlap 
condition C(4). 

We first introduce some terminology. A relation prefix of a word is a prefix
which admits a (necessarily unique, as a consequence of the small overlap
condition) factorisation of the form aXY where X and Y are the maximal
piece prefix and middle word respectively of some relation word XYZ. An
overlap prefix (of length n) of a word u is a relation prefix which admits
an (again necessarily unique) factorisation of the form $aX_1Y_1'X_2Y_2' \cdots X_nY_n$
where

• $n \ge 1$;

• no factor of the form $X_0Y_0$ begins before the end of the prefix a;

• for each $1 \le i \le n$, $R_i = X_iY_iZ_i$ is a relation word with $X_i$ and $Z_i$
the maximal piece prefix and suffix respectively; and

• for each $1 \le i < n$, $Y_i'$ is a proper, non-empty prefix of $Y_i$.

Notice that if a word has a relation prefix, then the shortest such must be
an overlap prefix. A relation prefix aXY of a word u is called clean if u
does not have a prefix

$$aXY'X_1Y_1$$

where $X_1$ and $Y_1$ are the maximal piece prefix and middle word respectively
of some relation word, and $Y'$ is a proper, non-empty prefix of Y. Clean
overlap prefixes, in particular, will play a crucial role in what follows.
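
To make the notion of a relation prefix concrete, here is a small sketch that locates the shortest relation prefix of a word. It assumes the pairs (X, Y) of maximal piece prefix and middle word have already been computed for every relation word; the function name `shortest_relation_prefix` and the example data are hypothetical.

```python
def shortest_relation_prefix(word, xy_pairs):
    """Return (a, X, Y) for the shortest prefix aXY of word, or None if none exists."""
    best = None
    for x, y in xy_pairs:
        pos = word.find(x + y)  # earliest occurrence of XY for this relation word
        if pos != -1:
            end = pos + len(x) + len(y)
            if best is None or end < best[0]:
                best = (end, word[:pos], x, y)
    return None if best is None else best[1:]


# Hypothetical data: one relation word with X = "ab" and Y = "cxy".
result = shortest_relation_prefix("zzabcxyab", [("ab", "cxy")])
```

As noted above, the prefix found this way is automatically an overlap prefix (with n = 1), although it need not be clean.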

Proposition 1. Let $aX_1Y_1'X_2Y_2' \cdots X_nY_n$ be an overlap prefix of some word.
Then this prefix contains no relation word as a factor (except possibly $X_nY_n$
in the case that $Z_n = e$).

Proof. Suppose that the given overlap prefix contains a relation word R as
a factor. By the definition of an overlap prefix, no occurrence of R can
begin before the end of the prefix a, so we may assume that R is a factor of
$X_1Y_1'X_2Y_2' \cdots X_nY_n$. It follows that either R contains $X_iY_i'$ as a factor for
some i, or else R is a factor of $X_iY_i'X_{i+1}Y_{i+1}'$ for some i (where $Y_{i+1}' = Y_n$ if
$i + 1 = n$) and we may assume without loss of generality that the occurrence
of R overlaps non-trivially with the prefix $X_iY_i'$.

In the former case, since $X_i$ is a maximal piece prefix of $X_iY_iZ_i$ and
$Y_i'$ is non-empty, $X_iY_i'$ cannot be a piece; it follows then that we must
have $R = X_iY_iZ_i$ with the occurrence in the obvious place. In the latter
case, R is the product of a non-empty factor of $X_iY_iZ_i$ with a factor of
$X_{i+1}Y_{i+1}Z_{i+1}$; but by the small overlap assumption, R cannot be written
as a product of two pieces, so it must again be that $R = X_iY_iZ_i$ with the
occurrence in the obvious place.

Now if $i = n$ then, since R is a factor of the given relation prefix, we must
clearly have $R = X_iY_iZ_i = X_iY_i$ so that $Z_i = e$. On the other hand, if $i < n$
then either $X_iY_iZ_i$ contains $X_{i+1}Y_{i+1}'$ as a factor, which contradicts the fact
that $X_{i+1}$ is a maximal piece prefix of $X_{i+1}Y_{i+1}Z_{i+1}$, or else (recalling that $Y_i'$ is a
proper prefix of $Y_i$) we see that $X_{i+1}Y_{i+1}'$ contains a non-empty suffix of $Y_i$
followed by $Z_i$, which contradicts the fact that $Z_i$ is a maximal piece suffix
of $X_iY_iZ_i$. □

Proposition 2. Let u be a word. Every overlap prefix of u is contained in 
a clean overlap prefix of u. 

Proof. We fix u and prove by induction on the difference between the length
of u and the length of the given overlap prefix, that is, on the length of
that part of u not contained in the given overlap prefix. For the base case,
observe that an overlap prefix constituting the whole of u is necessarily
clean. Now suppose $aX_1Y_1' \cdots X_nY_n$ is an overlap prefix, and that the result
holds for longer overlap prefixes of u. If the given prefix is clean then there
is nothing to prove. Otherwise, by the definition of a clean overlap prefix,
there exist words X and Y, being the maximal piece prefix and the middle
word respectively of some relation word, and a proper non-empty prefix $Y_n'$
of $Y_n$ such that

$$aX_1Y_1' \cdots X_nY_n'XY$$

is a prefix of u. Clearly this is an overlap prefix of u which is strictly longer
than the original one, and so by induction is contained in a clean overlap
prefix of u. But now the original overlap prefix of u is contained in a clean
overlap prefix, as required. □

Corollary 1. If a word u has no clean overlap prefix, then it contains no
relation word as a factor, and so if $u \equiv v$ then $u = v$.

Proof. Suppose u has no clean overlap prefix. If u contained a relation word
as a factor then clearly it would have a relation prefix, that is, a prefix
of the form $aX_RY_R$ for some relation word R. But by our observations
above, the shortest relation prefix of u would be an overlap prefix, and so by
Proposition 2 would be contained in a clean overlap prefix of u, contrary to
our assumption. Thus, u contains
no relation word as a factor. It follows easily that no relations can be applied
to u, so the only word equivalent to u is u itself. □



Lemma 1. If $u = wXYZu'$ with wXY a clean overlap prefix then $w\overline{X}\,\overline{Y}$ is
a clean overlap prefix of $w\overline{X}\,\overline{Y}\,\overline{Z}u'$.

Proof. Let

$$wXY = aX_1Y_1' \cdots X_nY_n'XY \tag{1}$$

be the factorisation given by the definition of a clean overlap prefix. Then
$w\overline{X}\,\overline{Y}\,\overline{Z}u'$ has a prefix

$$w\overline{X}\,\overline{Y} = aX_1Y_1' \cdots X_nY_n'\overline{X}\,\overline{Y} \tag{2}$$

If $n \ge 1$ it is immediate from the factorisation given by (2) that $w\overline{X}\,\overline{Y}$ is an
overlap prefix of $w\overline{X}\,\overline{Y}\,\overline{Z}u'$. In the case $n = 0$, however, we must consider the
possibility that the prefix $a\overline{X}\,\overline{Y} = w\overline{X}\,\overline{Y}$ contains a factor of the form $X_0Y_0$
overlapping the initial segment a. Suppose it does. Then recalling that
$Y_0$ is not a piece, and so cannot be a factor of $\overline{X}\,\overline{Y}$, we see that $a\overline{X}\,\overline{Y}$ admits
a factorisation

$$a\overline{X}\,\overline{Y} = bX_0Y_0'\overline{X}\,\overline{Y} \tag{3}$$

for some non-empty prefix $Y_0'$ of $Y_0$. Moreover, $Y_0'$ must be a proper prefix
of $Y_0$, or else a would have a factor $X_0Y_0$, contradicting the fact that wXY
was a clean overlap prefix of u. This shows that $w\overline{X}\,\overline{Y}$ is an overlap prefix
of $w\overline{X}\,\overline{Y}\,\overline{Z}u'$.

It remains to show that the given overlap prefix is clean. Suppose for
a contradiction that it is not. Then by definition, there is a factor of the
form $\hat{X}\hat{Y}$ overlapping the end of the prefix $a\overline{X}\,\overline{Y}$; but this factor is either
contained in $\overline{X}\,\overline{Y}\,\overline{Z}$ (contradicting the supposition that $\hat{X}$ is a maximal
piece prefix of a relation word $\hat{X}\hat{Y}\hat{Z}$) or contains a non-empty suffix of $\overline{Y}$
followed by $\overline{Z}$ (contradicting the assumption that $\overline{Z}$ is a maximal piece suffix
of $\overline{X}\,\overline{Y}\,\overline{Z}$). □

The following lemma is fundamental to our approach to C(4) monoids. 
With careful application it seems to permit a comparable understanding to 
that resulting from Remmers' geometric theory, but in a purely combinato- 
rial (and hence more computationally orientated) way. 

Lemma 2. Suppose a word u has clean overlap prefix wXY. If $u \equiv v$ then
v has overlap prefix either wXY or $w\overline{X}\,\overline{Y}$, and no relation word occurring as
a factor of v overlaps this prefix, unless it is XYZ or $\overline{X}\,\overline{Y}\,\overline{Z}$ as appropriate.

Proof. Since wXY is an overlap prefix of u, it has by definition a
factorisation

$$wXY = aX_1Y_1' \cdots X_nY_n'XY$$

for some $n \ge 0$. We use this fact to prove the claim by induction on the
length r of a rewrite sequence (using the defining relations) from u to v.

In the case $r = 0$, we have $u = v$, so v certainly has (clean) overlap
prefix wXY. By Proposition 1 no relation word factor can occur entirely
within this prefix (unless it is XY and $Z = e$). If a relation word factor of v
overlaps the end of the given overlap prefix and entirely contains XY then,
since XY is not a piece, that relation word must clearly be XYZ. Finally,
a relation word cannot overlap the end of the given overlap prefix but not
contain the suffix XY, since this would clearly contradict the fact that the
given overlap prefix is clean.

Suppose now for induction that the lemma holds for all values less than
r, and that there is a rewrite sequence from u to v of length r. Let $u_1$ be
the second term in the sequence, so that $u_1$ is obtained from u by a single
rewrite using the defining relations, and v from $u_1$ by $r - 1$ rewrites.

Consider the relation word in u which is to be rewritten in order to obtain
$u_1$, and in particular its position in u. By Proposition 1 this relation word
cannot be contained in the clean overlap prefix wXY, unless it is XY where
$Z = e$.

Suppose first that the relation word to be rewritten contains the final
factor Y of the given clean overlap prefix. (Note that this covers in
particular the case that the relation word is XY and $Z = e$.) From the C(4)
assumption we know that Y is not a piece, so we may deduce that the
relation word is XYZ contained in the obvious place. In this case, applying
the rewrite clearly leaves $u_1$ with a prefix $w\overline{X}\,\overline{Y}$, and by Lemma 1 this is
a clean overlap prefix. Now v can be obtained from $u_1$ by $r - 1$ rewrite
steps, so it follows from the inductive hypothesis that v has overlap prefix
either $w\overline{X}\,\overline{Y}$ or $w\overline{\overline{X}}\,\overline{\overline{Y}} = wXY$, and that no relation word occurring as a
factor of v overlaps this prefix, unless it is $\overline{X}\,\overline{Y}\,\overline{Z}$ or XYZ as appropriate;
this completes the proof in this case.

Next, we consider the case in which the relation word factor in u to be
rewritten does not contain the final factor $Y_n$ of the clean overlap prefix, but
does overlap with the end of the clean overlap prefix. Then u has a factor
of the form $\hat{X}\hat{Y}$, where $\hat{X}$ is the maximal piece prefix and $\hat{Y}$ the middle
word of a relation word, which overlaps $X_nY_n$, beginning after the start of
$Y_n$. This clearly contradicts the assumption that the overlap prefix is clean.

Finally, we consider the case in which the relation word factor in u which
is to be rewritten does not overlap the given clean overlap prefix at all. Then
obviously, the given clean overlap prefix of u remains an overlap prefix of
$u_1$. If this overlap prefix is clean, then a simple application of the inductive
hypothesis again suffices to prove that v has the required property.

There remains, then, only the case in which the given overlap prefix is no
longer clean in $u_1$. Then by definition there exist words $\hat{X}$ and $\hat{Y}$, being a
maximal piece prefix and middle word respectively of some relation word,
such that $u_1$ has the prefix

$$aX_1Y_1' \cdots X_{n-1}Y_{n-1}'X_nY_n'\hat{X}\hat{Y}$$

for some proper, non-empty prefix $Y_n'$ of $Y_n$. Now certainly this is not a prefix
of u, since this would contradict the assumption that $aX_1Y_1' \cdots X_nY_n$ is a
clean overlap prefix of u. So we deduce that $u_1$ must contain a relation word
overlapping the final $\hat{X}\hat{Y}$. This relation word cannot contain the final factor
$\hat{X}\hat{Y}$, since this would again contradict the assumption that $aX_1Y_1' \cdots X_nY_n$
is a clean overlap prefix of u. Nor can the relation word contain the final
factor $\hat{Y}$, since $\hat{Y}$ is not a piece. Hence, $u_1$ must have a prefix

$$aX_1Y_1' \cdots X_{n-1}Y_{n-1}'X_nY_n'\hat{X}\hat{Y}'R$$

for some proper, non-empty prefix $\hat{Y}'$ of $\hat{Y}$ and some
relation word R. Suppose $R = X_RY_RZ_R$ where $X_R$ and $Z_R$ are the maximal
piece prefix and suffix respectively. Then it is readily verified that

$$aX_1Y_1' \cdots X_{n-1}Y_{n-1}'X_nY_n'\hat{X}\hat{Y}'X_RY_R$$

is a clean overlap prefix of $u_1$. But now by the inductive hypothesis, v has
prefix either

$$aX_1Y_1' \cdots X_{n-1}Y_{n-1}'X_nY_n'\hat{X}\hat{Y}'X_RY_R \tag{4}$$

or

$$aX_1Y_1' \cdots X_{n-1}Y_{n-1}'X_nY_n'\hat{X}\hat{Y}'\overline{X_R}\,\overline{Y_R} \tag{5}$$

which in turn is easily seen to have prefix

$$aX_1Y_1' \cdots X_{n-1}Y_{n-1}'X_nY_n. \tag{6}$$

Moreover, by Proposition 1 the prefix (4) or (5) of v contains no relation
word as a factor (unless it is the final factor $X_RY_R$ or $\overline{X_R}\,\overline{Y_R}$ and the
corresponding maximal piece suffix is e) and it
follows easily that no relation word factor overlaps the prefix (6) of v. □

The lemma has the following easy corollary. 

Corollary 2. Suppose a word u has (not necessarily clean) overlap
prefix wXY. If $u \equiv v$ then v has a prefix w and contains no relation word
overlapping this prefix.

Proof. By Proposition 2 the overlap prefix wXY of u is contained in a clean
overlap prefix $w'X'Y'$ of u. Now by Lemma 2, v has a prefix $w'$ and contains
no relation word overlapping this prefix. But it is easily seen that $w'$ must
be at least as long as w, so that v has a prefix w and contains no relation
word overlapping this prefix, as required. □

The following proposition describes a very weak left cancellation property 
of small overlap monoids; it will allow us to restrict attention to words with 
a prefix of the form XY where X and Y are the maximal piece prefix and 
middle word respectively of some relation word. 

Proposition 3. Suppose a word u has an overlap prefix aXY and that
$u = aXYu''$. Then $u \equiv v$ if and only if $v = av'$ where $v' \equiv XYu''$.

Proof. Clearly if $v = av'$ with $v' \equiv XYu''$ then it is immediate that
$v = av' \equiv aXYu'' = u$.

Conversely, suppose $u \equiv v$. Since aXY is an overlap prefix, by
Proposition 1 it cannot contain a relation word starting before the end of a. By
Corollary 2, v has prefix a, say $v = av'$. Now consider a rewrite sequence,
using the defining relations, from u to v. Again using Corollary 2, every
term in this sequence will have prefix a, and contain no relation word
overlapping this prefix. It follows that the same sequence of rewrites can be
applied to take $XYu''$ to $v'$, so that $v' \equiv XYu''$ as required. □

We now introduce some more terminology. Let u be a word with shortest 
relation prefix aXY , and let p be a piece. We say that u is p-inactive if 
pu has shortest relation prefix paXY and p-active otherwise. The following 
proposition describes another weak cancellation property of small overlap 
monoids. 
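
The definition of p-active and p-inactive words can be checked mechanically once the pairs (X, Y) are known. The sketch below is illustrative only: the function names are hypothetical, words are Python strings, and the caller is assumed to supply the (X, Y) pairs of all relation words and a word u that has a relation prefix.

```python
def shortest_relation_prefix_length(word, xy_pairs):
    """Length of the shortest prefix of the form aXY, or None if there is none."""
    ends = [word.find(x + y) + len(x) + len(y)
            for x, y in xy_pairs if (x + y) in word]
    return min(ends) if ends else None


def is_inactive(p, u, xy_pairs):
    """True if the shortest relation prefix of pu is p followed by that of u."""
    base = shortest_relation_prefix_length(u, xy_pairs)
    if base is None:
        raise ValueError("u has no relation prefix")
    return shortest_relation_prefix_length(p + u, xy_pairs) == len(p) + base


# Hypothetical data: one relation word with X = "ab" and Y = "cxy".
pairs = [("ab", "cxy")]
u = "bcxyabcxy"
a_inactive = is_inactive("a", u, pairs)  # prepending "a" creates an earlier XY
b_inactive = is_inactive("b", u, pairs)  # prepending "b" does not
```

Comparing lengths suffices here because paXY is always a relation prefix of pu, so it is the shortest one exactly when no shorter relation prefix exists.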



Proposition 4. Let u be a word and p a piece. If u is p-inactive then $pu \equiv v$
if and only if $v = pw$ for some w with $u \equiv w$.



SMALL OVERLAP MONOIDS 



Proof. Suppose u has shortest relation prefix aXY, so that pu has shortest
relation prefix paXY. Suppose $u = aXYu''$. If $pu \equiv v$ then by Proposition 3
(since the shortest relation prefix is clearly an overlap prefix), we have
$v = pav'$ where $v' \equiv XYu''$. Now setting $w = av'$ we have $v = pw$ and
$u = aXYu'' \equiv av' = w$. The converse implication is obvious. □

Proposition 5. Let $Z_1$ and $Z_2$ be maximal piece suffixes of relation words
and suppose u is $Z_1$-active and $Z_2$-active. Then $Z_1$ and $Z_2$ have a common
non-empty suffix, and if z is the maximal common suffix then

(i) u is z-active;

(ii) $Z_1u \equiv v$ if and only if $v = z_1v'$ where $z_1z = Z_1$ and $v' \equiv zu$; and

(iii) $Z_2u \equiv v$ if and only if $v = z_2v'$ where $z_2z = Z_2$ and $v' \equiv zu$.

Proof. Let $bX_3Y_3$ and $cX_4Y_4$ be the shortest relation prefixes of $Z_1u$ and
$Z_2u$ respectively. Since u is $Z_1$-active and $Z_2$-active, we must have $|b| < |Z_1|$
and $|c| < |Z_2|$. Moreover, since $Z_1$ is a piece and $X_3$ is a maximal piece
prefix of the relation word $X_3Y_3Z_3$ we must have $|Z_1| \le |bX_3|$, and similarly
$|Z_2| \le |cX_4|$.

It follows that u has prefixes $X_3'Y_3$ and $X_4'Y_4$ where $X_3'$ and $X_4'$ are proper
(perhaps empty) suffixes of $X_3$ and $X_4$ respectively. Thus, one of $X_3'Y_3$ and
$X_4'Y_4$ is a prefix of the other, and so either $Y_3$ is a factor of $X_4Y_4$ and
hence of $X_4Y_4Z_4$, or $Y_4$ is a factor of $X_3Y_3$ and hence of $X_3Y_3Z_3$. But
by the C(4) assumption, neither $Y_3$ nor $Y_4$ is a piece so the only possible
explanation is that $X_3Y_3Z_3$ and $X_4Y_4Z_4$ are the same relation word, and
moreover $X_3' = X_4'$.

Now let p be such that $pX_3' = X_3$. We have already observed that $X_3'$ is
a proper suffix of $X_3$, so p is non-empty. Now $Z_1 = bp$, and also

$$pX_4' = pX_3' = X_3 = X_4$$

so by symmetry we have $Z_2 = cp$. Hence, p is a common non-empty suffix
of $Z_1$ and $Z_2$.

Now let z be the maximal common suffix of $Z_1$ and $Z_2$. Let y, $z_1$ and
$z_2$ be such that $z = yp$, $Z_1 = z_1z$ and $Z_2 = z_2z$. Then clearly $b = z_1y$
and $c = z_2y$. Now $zu = ypu$ has a relation prefix $yX_3Y_3$, from which it is
immediate that u is z-active so that (i) holds.

To show that (ii) holds, let $u'$ be such that $u = X_3'Y_3u'$, and suppose
$Z_1u \equiv v$. Now

$$Z_1u = z_1zX_3'Y_3u' = z_1ypX_3'Y_3u' = z_1yX_3Y_3u'$$

where $z_1yX_3Y_3$ is the shortest relation prefix, and hence is an overlap prefix.
Hence, by Proposition 3 we have $v = z_1yv''$ where $v'' \equiv X_3Y_3u'$. But now
setting $v' = yv''$ we have $v = z_1v'$, $z_1z = Z_1$ and

$$v' = yv'' \equiv yX_3Y_3u' = ypX_3'Y_3u' = zX_3'Y_3u' = zu$$

as required. Conversely, if $v = z_1v'$ where $z_1z = Z_1$ and $v' \equiv zu$ then we
have

$$Z_1u = z_1zu \equiv z_1v' = v.$$

This completes the proof that (ii) holds, and an entirely symmetric argument
shows that (iii) holds. □

Corollary 3. Let $Z_1$ and $Z_2$ be maximal piece suffixes of relation words.
Suppose u is $Z_2$-active and $Z_1u \equiv Z_1v$. Then $Z_2u \equiv Z_2v$.

Proof. If u is $Z_1$-inactive then by Proposition 4 we have $u \equiv v$, and so
certainly $Z_2u \equiv Z_2v$.

On the other hand, if u is $Z_1$-active then let z be the maximal common
suffix of $Z_1$ and $Z_2$ and let $z_1$ and $z_2$ be such that $z_1z = Z_1$ and $z_2z = Z_2$.
Then by Proposition 5(ii), since $Z_1u \equiv Z_1v$ we have $Z_1v = z_1v'$ where
$v' \equiv zu$. But from $z_1zv = Z_1v = z_1v'$ we deduce that $v' = zv$, so now we
have

$$Z_2u = z_2zu \equiv z_2v' = z_2zv = Z_2v.$$

□

Corollary 4. Let u and v be words and $Z_1$ and $Z_2$ be maximal piece suffixes
of relation words. Suppose there exist words $u = u_1, \ldots, u_n = v$ such that

$$Z_1u_1 \equiv Z_1u_2, \quad Z_2u_2 \equiv Z_2u_3, \quad Z_1u_3 \equiv Z_1u_4, \quad \ldots, \quad
\begin{cases} Z_1u_{n-1} \equiv Z_1u_n & \text{if } n \text{ is even} \\ Z_2u_{n-1} \equiv Z_2u_n & \text{if } n \text{ is odd.} \end{cases}$$

Then either $Z_1u \equiv Z_1v$ or $Z_2u \equiv Z_2v$ or both.

Proof. Fix u and v, and suppose n is minimal (allowing exchanging $Z_1$ and
$Z_2$ if necessary) such that a sequence of equivalences as above exists.
Suppose further for a contradiction that $n > 2$. If $u_2$ was $Z_1$-inactive then by
Proposition 4 we would have $u_1 \equiv u_2$ so that $Z_2u_1 \equiv Z_2u_2 \equiv Z_2u_3$,
contradicting the minimality assumption on n. Similarly, if $u_2$ was $Z_2$-inactive then
we would have $u_2 \equiv u_3$ so that $Z_1u_1 \equiv Z_1u_2 \equiv Z_1u_3$, again contradicting the
minimality assumption on n.

Thus, $u_2$ is both $Z_1$-active and $Z_2$-active. But now since $Z_1u_1 \equiv Z_1u_2$,
we apply Corollary 3 to see that $Z_2u_1 \equiv Z_2u_2 \equiv Z_2u_3$, again providing the
required contradiction. □

3. Sequential Characterisation of Equality 

In this section we use the theory developed in Section 2 to provide a new
characterisation of when two words in the generators of a small overlap
presentation represent the same element of the monoid presented. In Section 4
we shall use this characterisation to develop an efficient algorithm to solve
the word problem.

We first present a lemma which gives a set of mutually exclusive combi- 
natorial conditions, the disjunction of which is necessary and sufficient for 
two words of a certain form to represent the same element. 

Lemma 3. Suppose $u = XYu'$ where XY is a clean overlap prefix of u.
Then $u \equiv v$ if and only if one of the following mutually exclusive conditions
holds:

(1) $u = XYZu''$ and $v = XYZv''$ and either $Zu'' \equiv Zv''$ or $\overline{Z}u'' \equiv \overline{Z}v''$
or both;

(2) $u = XYu'$, $v = XYv'$, and Z fails to be a prefix of at least one of $u'$
and $v'$, and $u' \equiv v'$;

(3) $u = XYZu''$, $v = \overline{X}\,\overline{Y}\,\overline{Z}v''$ and either $Zu'' \equiv Zv''$ or $\overline{Z}u'' \equiv \overline{Z}v''$ or
both;

(4) $u = XYu'$, $v = \overline{X}\,\overline{Y}\,\overline{Z}v''$ but Z is not a prefix of $u'$ and $u' \equiv Zv''$;

(5) $u = XYZu''$, $v = \overline{X}\,\overline{Y}v'$ but $\overline{Z}$ is not a prefix of $v'$ and $\overline{Z}u'' \equiv v'$;

(6) $u = XYu'$, $v = \overline{X}\,\overline{Y}v'$, Z is not a prefix of $u'$ and $\overline{Z}$ is not a prefix
of $v'$, but $Z = z_1z$, $\overline{Z} = z_2z$, $u' = z_1u''$, $v' = z_2v''$ where $u'' \equiv v''$ and
z is the maximal common suffix of Z and $\overline{Z}$, z is non-empty, and z
is a possible prefix of $u''$.

Proof. First we treat the claim that the conditions (1)-(6) are mutually
exclusive. Since X is a maximal piece prefix of XYZ and Y is non-empty,
XY is not a piece. An entirely similar argument shows that $\overline{X}\,\overline{Y}$ is not a
piece. In particular, neither of XY and $\overline{X}\,\overline{Y}$ is a prefix of the other, and so
v can have at most one of them as a prefix. Thus, conditions (1)-(2) are not
consistent with conditions (3)-(6). The mutual exclusivity of (1) and (2) is
self-evident from the definitions, and likewise that of (3)-(6).

It is easily verified that each of the conditions (1)-(5) implies that $u \equiv v$.
We show next that (6) implies that $u \equiv v$. Since z is a possible prefix of $u''$
and $u'' \equiv v''$, we may write $u'' \equiv zx \equiv v''$ for some word x. Now we have

$$u = XYu' = XYz_1u'' \equiv XYz_1zx = XYZx$$

$$\equiv \overline{X}\,\overline{Y}\,\overline{Z}x = \overline{X}\,\overline{Y}z_2zx \equiv \overline{X}\,\overline{Y}z_2v'' = \overline{X}\,\overline{Y}v' = v.$$

What remains, which is the main burden of the proof, is to prove that $u \equiv v$
implies that at least one of the conditions (1)-(6) holds. To this end, then,
suppose $u \equiv v$; then there is a rewriting sequence taking u to v. By Lemma 2
every term in this sequence will have prefix either XY or $\overline{X}\,\overline{Y}$ and this prefix
can only be modified by the application of the relation $(XYZ, \overline{X}\,\overline{Y}\,\overline{Z})$ in the
obvious place. We now prove the claim by case analysis.

By Lemma 2, v begins either with XY or with $\overline{X}\,\overline{Y}$. Consider first the case
in which v begins with XY; we split this into two further cases depending
on whether u and v both begin with the full relation word XYZ; these will
correspond respectively to conditions (1) and (2) in the statement of the
lemma.

Case (1). Suppose $u = XYZu''$ and $v = XYZv''$. Then clearly there is
a rewriting sequence taking u to v which by Lemma 2 can be broken up as:

$$u = XYZu'' \to^* XYZu_1 \to \overline{X}\,\overline{Y}\,\overline{Z}u_1 \to^* \overline{X}\,\overline{Y}\,\overline{Z}u_2$$

$$\to XYZu_2 \to^* \cdots \to XYZu_n \to^* XYZv'' = v$$

where none of the steps in the sequences indicated by $\to^*$ involves rewriting
a relation word overlapping with the prefix XY or $\overline{X}\,\overline{Y}$ as appropriate. It
follows that there are rewriting sequences

$$Zu'' \to^* Zu_1, \quad \overline{Z}u_1 \to^* \overline{Z}u_2, \quad Zu_2 \to^* Zu_3, \quad \ldots, \quad Zu_n \to^* Zv''.$$

Now by Corollary 4 either $Zu'' \equiv Zv''$ or $\overline{Z}u'' \equiv \overline{Z}v''$ as required to show
that condition (1) holds.

Case (2). Suppose now that $u = XYu'$, $v = XYv'$ and Z fails to be
a prefix of at least one of $u'$ and $v'$. We must show that $u' \equiv v'$; suppose
for a contradiction that this does not hold. We consider only the case that
Z is not a prefix of $u'$; the case that Z is not a prefix of $v'$ is symmetric.
We consider rewriting sequences from $u = XYu'$ to $v = XYv'$. Again using
Lemma 2 we see that there is either (i) such a sequence taking u to v
containing no rewrites of relation words overlapping the prefix XY, or (ii)
such a sequence taking u to v which can be broken up as:

$$u = XYu' \to^* XYZu_1 \to \overline{X}\,\overline{Y}\,\overline{Z}u_1 \to^* \overline{X}\,\overline{Y}\,\overline{Z}u_2$$

$$\to XYZu_2 \to^* \cdots \to XYZu_n \to^* XYv' = v$$

where none of the intermediate words in the sequences indicated by $\to^*$
contains a relation word overlapping with the prefix XY or $\overline{X}\,\overline{Y}$ as appropriate.
In case (i) there is clearly a rewrite sequence taking $u'$ to $v'$ so that $u' \equiv v'$
as required. In case (ii), there are rewriting sequences

$$u' \to^* Zu_1, \quad \overline{Z}u_1 \to^* \overline{Z}u_2, \quad Zu_2 \to^* Zu_3, \quad \ldots, \quad Zu_n \to^* v'.$$

Notice that, since $u'$ does not begin with Z, we can deduce from
Proposition 4 that $u_1$ is Z-active. By Corollary 4 either $Zu_1 \equiv Zu_n$ or $\overline{Z}u_1 \equiv \overline{Z}u_n$.
In the latter case, since $u_1$ is Z-active, Corollary 3 tells us that we also have
$Zu_1 \equiv Zu_n$, so this equivalence holds in any case. But now

$$u' \equiv Zu_1 \equiv Zu_n \equiv v'$$

so condition (2) holds and we are done.

We have now shown that if $v$ begins with $XY$ then either condition (1) or
condition (2) holds. It remains to consider the case in which $v$ begins with
$\overline{X}\,\overline{Y}$, and show that one of conditions (1)-(6) must be satisfied. We split
the analysis here into four cases depending on whether $u$ begins with the
full relation word $XYZ$, and whether $v$ begins with the full relation word
$\overline{X}\,\overline{Y}\,\overline{Z}$; these four cases will correspond respectively to conditions (3)-(6) in
the statement of the lemma.

Case (3). Suppose $u = XYZu''$ and $v = \overline{X}\,\overline{Y}\,\overline{Z}v''$. Then
$u = XYZu'' \equiv v \equiv XYZv''$, so by the same argument as in case (1) we have either
$Zu'' \equiv Zv''$ or $\overline{Z}u'' \equiv \overline{Z}v''$, as required to show that condition (3) holds.

Case (4). Suppose $u = XYu'$ and $v = \overline{X}\,\overline{Y}\,\overline{Z}v''$ but $Z$ is not a prefix
of $u'$. Then $u = XYu' \equiv v \equiv XYZv''$. Now applying the same argument
as in case (2) (with $XYZv''$ in place of $v$ and setting $v' = Zv''$) we have
$u' \equiv v' = Zv''$ so that condition (4) holds.

Case (5). Suppose $u = XYZu''$, $v = \overline{X}\,\overline{Y}v'$ but $\overline{Z}$ is not a prefix of $v'$.
Then we have $\overline{X}\,\overline{Y}\,\overline{Z}u'' \equiv u \equiv v = \overline{X}\,\overline{Y}v'$. Now applying the same argument
as in case (2) (but with $\overline{X}\,\overline{Y}\,\overline{Z}u''$ in place of $u$, setting $u' = \overline{Z}u''$, and with the
roles of the two sides of the relation exchanged) we obtain $\overline{Z}u'' = u' \equiv v'$, so that
condition (5) holds.

Case (6). Suppose $u = XYu'$, $v = \overline{X}\,\overline{Y}v'$ and that $Z$ is not a prefix of $u'$
and $\overline{Z}$ is not a prefix of $v'$. It follows this time that there is a rewriting sequence
taking $u$ to $v$ of the form

$$u = XYu' \to^* XYZu_1 \to \overline{X}\,\overline{Y}\,\overline{Z}u_1 \to^* \overline{X}\,\overline{Y}\,\overline{Z}u_2 \to XYZu_2 \to^* \cdots \to \overline{X}\,\overline{Y}\,\overline{Z}u_n \to^* \overline{X}\,\overline{Y}v' = v$$

where once more none of the intermediate words in the sequences indicated
by $\to^*$ contains a relation word overlapping with the prefix $XY$ or $\overline{X}\,\overline{Y}$ as
appropriate. Now there are rewriting sequences

$$u' \to^* Zu_1, \quad \overline{Z}u_1 \to^* \overline{Z}u_2, \quad Zu_2 \to^* Zu_3, \quad \ldots, \quad Zu_{n-1} \to^* Zu_n, \quad \overline{Z}u_n \to^* v'.$$

Notice that, since $u'$ does not begin with $Z$, we may deduce from
Proposition 4 that $u_1$ is $Z$-active. By Corollary 4 either $Zu_1 \equiv Zu_n$ or
$\overline{Z}u_1 \equiv \overline{Z}u_n$. In the latter case, since $u_1$ is $Z$-active, Corollary 3 tells us
that we also have $Zu_1 \equiv Zu_n$ anyway. But now

$$u' \equiv Zu_1 \equiv Zu_n$$

where $u'$ does not begin with $Z$, and also $v' \equiv \overline{Z}u_n$ where $v'$ does not begin
with $\overline{Z}$. By applying Proposition 4 twice, we deduce that $u_n$ is both $Z$-active
and $\overline{Z}$-active.

Let $z$ be the maximal common suffix of $Z$ and $\overline{Z}$. Then applying
Proposition 5 (with $Z_1 = Z$ and $Z_2 = \overline{Z}$), we see that $z$ is non-empty and

• $u' = z_1u''$ where $Z = z_1z$ and $u'' \equiv zu_n$; and

• $v' = z_2v''$ where $\overline{Z} = z_2z$ and $v'' \equiv zu_n$.

But then we have $u'' \equiv zu_n \equiv v''$ and also $z$ is a possible prefix of $u''$, as
required to show that condition (6) holds. □

Lemma 3 gives a first clue as to how one might solve the word problem for
a small overlap monoid by analysing words sequentially from left to right.
The natural strategy is as follows. First, use Proposition 3 to reduce to
the case in which the words both have clean relation prefixes of the form
$XY$ or $\overline{X}\,\overline{Y}$. Now by examining short prefixes, one can clearly always rule
out at least five of the six mutually exclusive conditions of the lemma. The
remaining condition will involve equivalence of words derived from suffixes
of $u$ and $v$, so apply the same approach recursively to test whether this
condition is satisfied.
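
The prefix examination in this strategy can be illustrated with a short sketch. It is not part of the algorithm as stated in the paper: the function name `candidate_condition` and the use of Python strings for words are assumptions made purely for illustration, and the arguments `Xb`, `Yb`, `Zb` stand for $\overline{X}$, $\overline{Y}$, $\overline{Z}$. The sketch assumes that $XY$ is already known to be a clean overlap prefix of $u$, and reports the one condition of the lemma that the prefixes of $u$ and $v$ do not rule out.

```python
def candidate_condition(u, v, X, Y, Z, Xb, Yb, Zb):
    """Return the number (1-6) of the only condition of the lemma not ruled
    out by the prefixes of u and v, or None if u and v cannot be equivalent.

    Assumption: XY is a clean overlap prefix of u, and (XYZ, Xb Yb Zb) is the
    corresponding relation; finding XY is outside the scope of this sketch.
    """
    xy, xbyb = X + Y, Xb + Yb
    if not u.startswith(xy):
        raise ValueError("u must begin with the clean overlap prefix XY")
    u_rest = u[len(xy):]
    u_full = u_rest.startswith(Z)          # does u begin with all of XYZ?

    if v.startswith(xy):
        v_full = v[len(xy):].startswith(Z)
        # Both begin with XYZ: condition (1); otherwise condition (2).
        return 1 if (u_full and v_full) else 2

    if v.startswith(xbyb):
        v_full = v[len(xbyb):].startswith(Zb)
        if u_full and v_full:
            return 3
        if not u_full and v_full:
            return 4
        if u_full and not v_full:
            return 5
        return 6

    # v begins with neither XY nor Xb Yb, so no condition can hold.
    return None
```

For instance, with the single hypothetical relation $(abc, def)$ and $X = a$, $Y = b$, $Z = c$, the call `candidate_condition("abx", "dey", "a", "b", "c", "d", "e", "f")` selects condition (6). Whether the selected condition actually holds still has to be decided by the recursive test on the suffixes.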

This approach meets with several apparent obstacles. Firstly, it is not
clear that the words derived from the suffixes of $u$ and $v$, which must be
tested for equivalence in the recursive call, are shorter than the original
words $u$ and $v$; for example, a relation word $XYZ$ may be shorter than the
maximal piece suffix $\overline{Z}$ of the word on the other side of the relation. In fact
the recursive call will not always involve shorter words, but it will involve
words which are simpler in a more subtle sense, so that the algorithm still
terminates rapidly. Secondly, some of the conditions involve a disjunction
of equivalence of two pairs of words derived from the suffixes; testing both
would require two recursive calls, potentially leading to exponential time
complexity. It transpires, though, that the theory of activity and inactivity
developed in Section 2 means that one recursive call will always suffice.
Finally, some of the conditions require us to check the possible prefixes of
words derived from suffixes; this problem is solved by the following
development of Lemma 3, which gives simultaneous conditions for two words to be
equal, and to admit a given piece as a possible prefix.

Lemma 4. Suppose $u = XYu'$ where $XY$ is a clean overlap prefix, and
suppose $p$ is a piece. Then $u \equiv v$ and $p$ is a possible prefix of $u$ if and only
if one of the following mutually exclusive conditions holds:

(1') $u = XYZu''$ and $v = XYZv''$, either $Zu'' \equiv Zv''$ or $\overline{Z}u'' \equiv \overline{Z}v''$,
and also $p$ is a prefix of either $X$ or $\overline{X}$ or both;

(2') $u = XYu'$, $v = XYv'$, and $Z$ fails to be a prefix of at least one of $u'$
and $v'$, and $u' \equiv v'$, and also either
— $p$ is a prefix of $X$; or
— $p$ is a prefix of $\overline{X}$ and $Z$ is a possible prefix of $u'$;
or both;

(3') $u = XYZu''$, $v = \overline{X}\,\overline{Y}\,\overline{Z}v''$ and either $Zu'' \equiv Zv''$ or $\overline{Z}u'' \equiv \overline{Z}v''$ or
both, and also $p$ is a prefix of $X$ or $\overline{X}$ or both;

(4') $u = XYu'$, $v = \overline{X}\,\overline{Y}\,\overline{Z}v''$ but $Z$ is not a prefix of $u'$ and $u' \equiv Zv''$,
and also $p$ is a prefix of $X$ or $\overline{X}$ or both;

(5') $u = XYZu''$, $v = \overline{X}\,\overline{Y}v'$ but $\overline{Z}$ is not a prefix of $v'$ and $\overline{Z}u'' \equiv v'$,
and also $p$ is a prefix of $X$ or $\overline{X}$ or both;

(6') $u = XYu'$, $v = \overline{X}\,\overline{Y}v'$, $Z$ is not a prefix of $u'$ and $\overline{Z}$ is not a prefix
of $v'$, but $Z = z_1z$, $\overline{Z} = z_2z$, $u' = z_1u''$, $v' = z_2v''$ where $u'' \equiv v''$, $z$
is the maximal common suffix of $Z$ and $\overline{Z}$, $z$ is non-empty, $z$ is a
possible prefix of $u''$, and also $p$ is a prefix of $X$ or $\overline{X}$ or both.

Proof. Mutual exclusivity of the six conditions is proved exactly as for
Lemma 3.

Suppose now that one of the six conditions above applies. Each condition
clearly implies the corresponding condition from Lemma 3, so we deduce
immediately that $u \equiv v$. We must show, using the fact that $p$ is a prefix of
$X$ or of $\overline{X}$, that $p$ is a possible prefix of $u$, or equivalently of $v$.

In case (1'), if $p$ is a prefix of $X$ then it is a prefix of $u$, while if $p$ is a
prefix of $\overline{X}$ then it is a prefix of $\overline{X}\,\overline{Y}\,\overline{Z}u''$, which is clearly equivalent to $u$. In
case (2'), if $p$ is a prefix of $X$ then it is again a prefix of $u$, while if $p$ is a
prefix of $\overline{X}$ and $Z$ is a possible prefix of $u'$, say $u' \equiv Zw$, then

$$u = XYu' \equiv XYZw \equiv \overline{X}\,\overline{Y}\,\overline{Z}w$$

where the latter has $p$ as a prefix. In the remaining cases $u$ begins with $X$
and $v$ begins with $\overline{X}$, so $p$ is a prefix of either $u$ or $v$, and hence a possible
prefix of $u$.

Conversely, suppose $u \equiv v$ and $p$ is a possible prefix of $u$. Then exactly
one of the six conditions in Lemma 3 applies. By Lemma 2 every word
equivalent to $u$ begins with either $XY$ or $\overline{X}\,\overline{Y}$. Since $p$ is a piece, $X$ is the
maximal piece prefix of $XYZ$, and $\overline{X}$ is the maximal piece prefix of $\overline{X}\,\overline{Y}\,\overline{Z}$,
it follows that $p$ is a prefix of either $X$ or $\overline{X}$. If any but condition (2) of
Lemma 3 is satisfied, this suffices to show that the corresponding condition
from the statement of Lemma 4 holds.

If condition (2) from Lemma 3 applies, we must show additionally that
either $p$ is a prefix of $X$, or $p$ is a prefix of $\overline{X}$ and $Z$ is a possible prefix
of $u'$. Suppose $p$ is not a prefix of $X$. Then by the above, $p$ is a prefix
of $\overline{X}$. It follows from Lemma 2 that the only way the prefix $XY$ of the
word $u$ can be changed using the defining relations is by application of the
relation $(XYZ, \overline{X}\,\overline{Y}\,\overline{Z})$. In order for this to happen, one must clearly be able
to rewrite $u = XYu'$ to a word of the form $XYZw$; consider the shortest
possible rewriting sequence which achieves this. By Lemma 2 no term in the
sequence except for the last term will contain a relation word overlapping
the initial $XY$. It follows that the same rewriting steps rewrite $u'$ to $Zw$,
so that $Z$ is a possible prefix of $u'$, as required. □
4. The Algorithm

In this section we present an algorithm, for a fixed monoid presentation
satisfying C(4), which takes as input arbitrary words $u$ and $v$ and a piece $p$,
and decides whether $u \equiv v$ and $p$ is a possible prefix of $u$. It will transpire
that this algorithm can be implemented to run in time linear in the shorter
of $u$ and $v$. In particular, by setting $p = \varepsilon$ we obtain an algorithm to
solve the word problem in time linear in the shorter of the input words.
The algorithm is shown (in recursive/functional pseudocode) in Figure 1.
Our first objective is to prove the correctness of the algorithm, that is, that
whenever the algorithm terminates, the output it gives is correct.



Lemma 5. Suppose $u$ and $v$ are words and $p$ a piece. Then the algorithm
WP-PREFIX($u$, $v$, $p$)

• outputs YES only if $u \equiv v$ and $p$ is a possible prefix of $u$; and

• outputs NO only if $u \not\equiv v$ or $p$ is not a possible prefix of $u$.

Proof. We prove correctness using induction on the number $n$ of recursive
calls.

Consider first the base case $n = 0$, that is, where the algorithm terminates
without a recursive call. Suppose $u$, $v$ and $p$ are such that this happens.
We consider each of the possible lines at which termination may occur,
establishing in each case that the output produced is correct.

Line 3. If $u = \varepsilon$, $v = \varepsilon$ and $p = \varepsilon$ then clearly $u \equiv v$ and $p$ is a possible prefix
of $u$, so the output YES is correct.

Line 4. If $u = \varepsilon$ [respectively, $v = \varepsilon$] then it follows easily from the small
overlap condition C(4) that no relations can be applied to $u$ [$v$];
indeed a relation which could be applied to $u$ [$v$] would have to have
$\varepsilon$ as one side, but $\varepsilon$ is a piece and hence cannot be a relation word.
Hence, we can have that $u \equiv v$ and $p$ is a possible prefix of $u$ only
if $u = v = p = \varepsilon$. In this case, this condition is not satisfied, so the
output NO is correct.

Line 7. In this case, $u$ does not begin with a clean overlap prefix of the form
$XY$. So by Proposition 3 every word equivalent to $u$ must begin
with the same letter as $u$. Hence, if $u$ and $v$ do not begin with the
same letter then we cannot have $u \equiv v$, so the output NO is correct.

Line 9. Again, $u$ does not begin with a clean overlap prefix. If $p$ is non-empty
and begins with a different letter to $u$, then again by Proposition 3
$p$ cannot be a possible prefix of $u$, so the output NO is correct.

Line 19. We are now in the case that $u$ has a clean overlap prefix $XY$. If $p$
is not a prefix of $X$ or $\overline{X}$ then by Lemma 2 we see that $p$ is not a
possible prefix of $u$, so the output NO is correct.

Line 21. Once again, we are in the case that $u$ has a clean overlap prefix $XY$.
If $v$ does not begin with either $XY$ or $\overline{X}\,\overline{Y}$ then by Lemma 3 we
cannot have $u \equiv v$, so the output NO is correct.

Line 43. We are now in the case that $u = XYu'$ and $v = \overline{X}\,\overline{Y}v'$ where $Z$ is
not a prefix of $u'$ and $\overline{Z}$ is not a prefix of $v'$. We know also that $z$ is
the maximal common suffix of $Z$ and $\overline{Z}$, and $z_1$ and $z_2$ are such that
$Z = z_1z$ and $\overline{Z} = z_2z$. By Lemma 4 we cannot have $u \equiv v$ unless $u'$
WP-PREFIX($u$, $v$, $p$)

 1  if $u = \varepsilon$ or $v = \varepsilon$
 2    then if $u = \varepsilon$ and $v = \varepsilon$ and $p = \varepsilon$
 3      then return YES
 4      else return NO
 5  elseif $u$ does not have the form $XYu'$ with $XY$ a clean overlap prefix
 6    then if $u$ and $v$ begin with different letters
 7      then return NO
 8    elseif $p \neq \varepsilon$ and $u$ and $p$ begin with different letters
 9      then return NO
10    else
11      $u \leftarrow u$ with first letter deleted
12      $v \leftarrow v$ with first letter deleted
13      if $p \neq \varepsilon$
14        then $p \leftarrow p$ with first letter deleted
15      return WP-PREFIX($u$, $v$, $p$)
16  else
17    let $X$, $Y$, $u'$ be such that $u = XYu'$
18    if $p$ is a prefix of neither $X$ nor $\overline{X}$
19      then return NO
20    elseif $v$ does not begin either with $XY$ or with $\overline{X}\,\overline{Y}$
21      then return NO
22    elseif $u = XYZu''$ and $v = XYZv''$
23      then if $u''$ is $\overline{Z}$-active
24        then return WP-PREFIX($\overline{Z}u''$, $\overline{Z}v''$, $\varepsilon$)
25        else return WP-PREFIX($Zu''$, $Zv''$, $\varepsilon$)
26    elseif $u = XYu'$ and $v = XYv'$
27      then if $p$ is a prefix of $X$
28        then return WP-PREFIX($u'$, $v'$, $\varepsilon$)
29        else return WP-PREFIX($u'$, $v'$, $Z$)
30    elseif $u = XYZu''$ and $v = \overline{X}\,\overline{Y}\,\overline{Z}v''$
31      then if $u''$ is $\overline{Z}$-active
32        then return WP-PREFIX($\overline{Z}u''$, $\overline{Z}v''$, $\varepsilon$)
33        else return WP-PREFIX($Zu''$, $Zv''$, $\varepsilon$)
34    elseif $u = XYu'$ and $v = \overline{X}\,\overline{Y}\,\overline{Z}v''$
35      then return WP-PREFIX($u'$, $Zv''$, $\varepsilon$)
36    elseif $u = XYZu''$ and $v = \overline{X}\,\overline{Y}v'$
37      then return WP-PREFIX($\overline{Z}u''$, $v'$, $\varepsilon$)
38    elseif $u = XYu'$ and $v = \overline{X}\,\overline{Y}v'$
39      then let $z$ be the maximal common suffix of $Z$ and $\overline{Z}$
40        let $z_1$ be such that $Z = z_1z$
41        let $z_2$ be such that $\overline{Z} = z_2z$
42        if $u'$ does not begin with $z_1$ or $v'$ does not begin with $z_2$
43          then return NO
44          else let $u''$ be such that $u' = z_1u''$
45            let $v''$ be such that $v' = z_2v''$
46            return WP-PREFIX($u''$, $v''$, $z$)

Figure 1. Algorithm for the Word Problem
and $v'$ have the form $z_1u''$ and $z_2v''$ respectively, so if this is not the
case, the output NO is correct.

Now let $n > 0$ and suppose for induction that the algorithm produces the
correct output whenever it terminates after strictly fewer than $n$ recursive
calls. Let $u$, $v$, $p$ be such that the algorithm terminates after $n$ recursive calls.
This time, we consider each of the possible places at which the first recursive
call can be made, establishing in each case that the output produced is
correct.

Line 15. In this case $u$ does not begin with a clean overlap prefix of the form
$XY$ and we have $u = au'$. It follows by Proposition 3 that every
word equivalent to $u$ has the form $aw$ where $w \equiv u'$. In particular,
$u \equiv v = av'$ if and only if $u' \equiv v'$, and $p$ is a possible prefix of $u$ exactly if
either $p = \varepsilon$ or $p = ap'$ where $p'$ is a possible prefix of $u'$. By the
inductive hypothesis, the recursive call correctly establishes whether
these conditions hold.

Line 24. We know that $u = XYZu''$, that $v = XYZv''$ and that $p$ is a prefix
of $X$ or $\overline{X}$. By Lemma 4 it follows that $u \equiv v$ and $p$ is a possible
prefix of $u$ if and only if $Zu'' \equiv Zv''$ or $\overline{Z}u'' \equiv \overline{Z}v''$. We also know
that $u''$ is $\overline{Z}$-active, so by Corollary 3 this is true if and only if
$\overline{Z}u'' \equiv \overline{Z}v''$.

Line 25. This is the same as the previous case, except that $u''$ is not $\overline{Z}$-active.
In this case, by Proposition 4 we have that $\overline{Z}u'' \equiv \overline{Z}v''$ implies
$u'' \equiv v''$ which in turn implies $Zu'' \equiv Zv''$, so it suffices to test the
latter.

Line 28. Here we know that $u = XYu'$, $v = XYv'$, that $Z$ fails to be a prefix of
at least one of $u'$ and $v'$, and that $p$ is a prefix of $X$. It follows by Lemma 4
that $u \equiv v$ and $p$ is a possible prefix of $u$ if and only if $u' \equiv v'$.

Line 29. This time we know that $u = XYu'$, $v = XYv'$ and that $p$ is a prefix
of $\overline{X}$ but not of $X$. It follows by Lemma 4 that $u \equiv v$ and $p$ is a
possible prefix of $u$ if and only if $u' \equiv v'$ and $Z$ is a possible prefix
of $u'$.

Line 32. Here we have $u = XYZu''$ and $v = \overline{X}\,\overline{Y}\,\overline{Z}v''$, and $p$ is a prefix of $X$ or
$\overline{X}$. It follows by Lemma 4 that $u \equiv v$ and $p$ is a possible prefix of $u$ if
and only if either $Zu'' \equiv Zv''$ or $\overline{Z}u'' \equiv \overline{Z}v''$. We also know that $u''$
is $\overline{Z}$-active, so by Corollary 3 this is true if and only if $\overline{Z}u'' \equiv \overline{Z}v''$.

Line 33. This is the same as the previous case, except that $u''$ is not $\overline{Z}$-active.
In this case, by Proposition 4 we have that $\overline{Z}u'' \equiv \overline{Z}v''$ implies
$u'' \equiv v''$ which in turn implies $Zu'' \equiv Zv''$, so it suffices to test the
latter.

Line 35. If we get here, we know that $u = XYu'$, that $v = \overline{X}\,\overline{Y}\,\overline{Z}v''$, that $Z$
is not a prefix of $u'$ and that $p$ is a prefix of $X$ or $\overline{X}$; it follows that
$u \equiv v$ and $p$ is a possible prefix of $u$ if and only if condition (4') of
Lemma 4 holds, that is, if and only if $u' \equiv Zv''$. By the inductive
hypothesis, the recursive call will correctly establish if this is the
case.

Line 37. The argument here is symmetric to that for line 35.

Line 46. Having got here, we know that $p$ is a prefix of $X$ or $\overline{X}$, that $u = XYu'$
and $v = \overline{X}\,\overline{Y}v'$ where $Z$ is not a prefix of $u'$ and $\overline{Z}$ is not a prefix of
$v'$. We know also that $z$ is the maximal common suffix of $Z$ and $\overline{Z}$,
and $z_1$ and $z_2$ are such that $Z = z_1z$ and $\overline{Z} = z_2z$. Finally, we know
that $u' = z_1u''$ and $v' = z_2v''$. It follows by Lemma 4 that $u \equiv v$ and
$p$ is a possible prefix of $u$ if and only if $u'' \equiv v''$ and $z$ is a possible
prefix of $u''$. By the inductive hypothesis, the recursive call correctly
establishes whether this holds.
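
The reduction carried out before the recursive call at line 46 (lines 39-46 of Figure 1) is simple string manipulation, and can be sketched as follows. This is an illustrative sketch only: the function names `common_suffix_split` and `reduce_case_six` are hypothetical, words are modelled as Python strings, and `Zbar` stands for $\overline{Z}$.

```python
def common_suffix_split(Z, Zbar):
    """Return (z1, z2, z) where z is the maximal common suffix of Z and Zbar,
    Z == z1 + z and Zbar == z2 + z."""
    n = 0
    while n < len(Z) and n < len(Zbar) and Z[-1 - n] == Zbar[-1 - n]:
        n += 1
    # Slice from an explicit index so that n == 0 yields the empty suffix.
    return Z[:len(Z) - n], Zbar[:len(Zbar) - n], Z[len(Z) - n:]


def reduce_case_six(u_rest, v_rest, Z, Zbar):
    """Mirror lines 39-46 of the figure for u = XY u_rest, v = Xbar Ybar v_rest.

    Returns the arguments (u'', v'', z) of the recursive call, or None in the
    situation where the figure returns NO at line 43.
    """
    z1, z2, z = common_suffix_split(Z, Zbar)
    if not u_rest.startswith(z1) or not v_rest.startswith(z2):
        return None
    return u_rest[len(z1):], v_rest[len(z2):], z
```

With the hypothetical values $Z = abc$ and $\overline{Z} = dbc$, the common suffix is $z = bc$, so `reduce_case_six("axy", "dxy", "abc", "dbc")` yields the triple `("xy", "xy", "bc")` to be passed to the recursive call.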



We have now shown that our algorithm produces the correct output whenever
it terminates, but we have not yet shown that it always terminates. In
fact, the following lemma shows that it does so after only a linear number
of recursive calls.

Lemma 6. Let $k$ be the length of the longest maximal piece suffix of a
relation word. The number of recursive calls during execution of a call to
WP-PREFIX($u$, $v$, $p$) is bounded above by $(k + 2)|u| + 1$.

Proof. For clarity in our analysis, we let $u_i$, $v_i$ and $p_i$ denote the parameters
to the $i$th recursive call in the execution (with in particular $u_0 = u$, $v_0 = v$
and $p_0 = p$). Each call to the function involves executing exactly one of
the sections at lines 2-4, 6-15 and 17-46; we call these calls of type A, B and C
respectively. We shall show that the number of calls of each of these types
is bounded above by a linear function of $|u|$, so that the total number of
recursive calls is also bounded above by a linear function of $|u|$.

First, notice that a call of type A cannot make a recursive call, so there is
at most one type A call in the execution.

Now for a word $x$ we let $r(x) = 0$ if $x$ does not have a clean overlap prefix,
and otherwise let $r(x)$ be the length of the part of $x$ which follows the shortest
clean overlap prefix, that is, $|x'|$ where $x = aXYx'$ with $aXY$ the shortest clean
overlap prefix.

It is readily verified that if the $i$th recursive call is of type B and itself
makes a recursive call then we have $r(u_{i+1}) = r(u_i)$, while if the $i$th recursive
call is of type C and itself makes a recursive call then we have $r(u_{i+1}) < r(u_i)$.
Since $r(u_i)$ can never be negative, it follows that the total number
of recursive calls of type C is bounded above by $r(u_0) + 1$, which
clearly is no more than $|u_0|$.

Now note that if the $i$th recursive call is of type B and itself makes a
recursive call then we have $|u_{i+1}| = |u_i| - 1$, while if the $i$th recursive call is
of type C and itself makes a recursive call then we have $|u_{i+1}| \le |u_i| + k$.
We have seen that the entire execution cannot feature more than $|u_0|$ calls
of type C or more than one call of type A. Hence, if the execution involves
$i$ recursive calls, it must include at most $|u_0|$ calls of type C, and at least
$i - |u_0| - 1$ calls of type B. It follows that, if execution involves $i$ recursive
calls, we must have

$$|u_i| \le |u_0| + |u_0|k - (i - |u_0| - 1) = (k + 2)|u| - i + 1.$$

Since the length of $u_i$ cannot be negative, it follows that execution must
terminate after at most $(k + 2)|u| + 1$ recursive calls. □



It remains to justify our claim that this algorithm can be implemented
in linear time. Since the concept of linear time is highly dependent upon the
model of computation, it is necessary to be precise about the model under
consideration. We consider a Turing machine with two two-way-infinite
read-write storage tapes, using a tape alphabet including the generators for
our monoid and a separator symbol #. (Recall that a two-way-infinite tape
can be simulated using a one-way-infinite tape in linear time [U Section 7.5],
so the assumption of a two-way-infinite tape is essentially immaterial.) If
we assume that the input words $u$, $v$ and $p$ are initially encoded on one of
the tapes in the form #u#v#p#, then it is easily seen that, with a linear
amount of preprocessing, we can store the piece $p$ in the finite state control,
and arrange for #u# and #v# to be the content of the first and second
tape respectively.

It is straightforward to verify that, given a word u, one can check whether 
u has a clean overlap prefix of the form XY, and if so find X, Y and the 
corresponding Z, by analysing a prefix of u of bounded length. Similarly, 
for a given maximal piece suffix Z, we can check whether u is Z -active by 
analysing a prefix of u of bounded length. It follows that each recursive 
step of our algorithm involves analysing prefixes of u and v of bounded 
length, before possibly making a recursive call, with u and v modified only 
by changing prefixes of bounded length. Clearly any analysis of a bounded 
length prefix can be performed in constant time; moreover, if a recursive call 
is required then the tape contents can be modified to contain the parameters 
for that call, again in constant time. It follows that the algorithm can be 
implemented with execution time bounded above by a linear function of the 
number of recursive calls in the execution, which by Lemma [6] is bounded 
above by a linear function of the length of u. 

Moreover, by swapping u and v at the start of the computation if nec- 
essary, we may assume without loss of generality that u is shorter than v. 
Thus we obtain the following. 

Theorem 1. For every monoid presentation satisfying C(4), there exists
a two-tape Turing machine which solves the corresponding word problem
in time linear in the shorter of the input words.

The reader may initially be surprised by the fact that one can test
equivalence of two words in time bounded by a function of the shorter word;
indeed, this bound potentially does not even afford time to fully read the
longer word! However, Remmers showed that, for a fixed C(3) presentation,
the length of the longer of two equivalent words is bounded by a linear
function of the length of the shorter [U Theorem 5.2.14]. Thus, if the difference
in lengths of two words is too great, one may conclude without further
analysis that the words are not equivalent. In fact Remmers' result is the only
possible explanation for this phenomenon, so the fact that this property
holds for C(4) presentations can also be deduced from Theorem 1.

5. Uniform Decision Problems 

In Section 4 we developed a linear time algorithm to solve the word problem
for a fixed small overlap presentation. Since our method of describing
the algorithm was entirely constructive, one might reasonably expect that
it also gives rise to a solution for the uniform word problem for C(4)
presentations, that is, the algorithmic problem of, given a C(4) presentation and
two words, deciding whether the words represent the same element of the
monoid presented. In this section, we shall see that this is indeed the case,
and show that the resulting algorithm remains fast.

To avoid unnecessary technicalities, we describe and analyse the algo- 
rithms using the RAM model of computation; in particular this allows us 
to assume that elementary operations involving generators from the presen- 
tation (such as comparing two generators) are single steps performable in 
constant time. The exact time complexity of a Turing machine implemen- 
tation would depend upon the number of tapes and the precise encoding of 
the input, but would certainly remain polynomial of low degree in the input 
size. 

We begin with some simple results describing the complexity of some
elementary computations with a finite monoid presentation. If $\langle \mathcal{A} \mid \mathcal{R} \rangle$ is a
finite presentation we denote by $|\mathcal{A}|$ the cardinality of the alphabet $\mathcal{A}$, and
by $|\mathcal{R}|$ the sum length of the relation words in $\mathcal{R}$. Where the meaning is
clear, we shall abuse notation by using $\mathcal{R}$ also to denote the set of relation
words in the presentation.

Proposition 6. There is a RAM algorithm which, given a presentation
$\langle \mathcal{A} \mid \mathcal{R} \rangle$ and a word $w$, computes the maximum piece prefix (and/or maximum
piece suffix) of $w$ in time $O(|w||\mathcal{R}|)$. In particular, there is a RAM algorithm
which, given the same input, decides whether the word $w$ is a piece in
time $O(|w||\mathcal{R}|)$.

Proof. For each relation word $R \in \mathcal{R}$ and position $1 \le i \le |R|$ in that word
we can compute in time $O(|w|)$ the length $n$ of the longest common prefix
of $w$ and $R_i \ldots R_{|R|}$ (where $R_j$ represents the $j$th letter of $R$). Our machine
does this for each relation word and each position in that relation word in
turn, recording as it goes along (i) the maximum value of $n$ attained so far,
and (ii) the maximum value of $n$ which has been attained or exceeded at
least twice. The latter, upon completion, is clearly the length of the longest
piece prefix of $w$, and the total time taken for execution is

$$O\Big(\sum_{R \in \mathcal{R}} |R|\,|w|\Big) = O(|w|\,|\mathcal{R}|)$$

as claimed. An obvious dual algorithm can be used to find the longest piece
suffix of $w$. □

Corollary 5. There is a RAM algorithm which, given as input a presentation
$\langle \mathcal{A} \mid \mathcal{R} \rangle$, decides in time $O(|\mathcal{R}|^2)$ whether the presentation satisfies the
condition C(4).

Proof. Our machine begins by computing the maximum piece prefix $X_R$ and
maximum piece suffix $Z_R$ for each relation word $R \in \mathcal{R}$; by Proposition 6
this can be done in time

$$O\Big(\sum_{R \in \mathcal{R}} |R|\,|\mathcal{R}|\Big) = O(|\mathcal{R}|^2).$$

It then tests, in time $O(|\mathcal{R}|)$, whether for any of the relation words $R$ we
have $|X_R| + |Z_R| \ge |R|$. If so then some relation word is a product of two
pieces, so the presentation does not even satisfy the weaker condition C(3)
and we are done.

Otherwise, the machine computes, again in time $O(|\mathcal{R}|)$, the middle word
$Y_R$ of each relation word. By our remarks in Section [IJ the presentation
satisfies C(4) if and only if none of the words $Y_R$ is a piece. Using Proposition 6
again, this condition can be tested in time

$$O\Big(\sum_{R \in \mathcal{R}} |Y_R|\,|\mathcal{R}|\Big) = O(|\mathcal{R}|^2).$$

Thus, we have described a RAM algorithm to test a presentation $\langle \mathcal{A} \mid \mathcal{R} \rangle$
for the C(4) condition in time $O(|\mathcal{R}|^2)$. □
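
The procedures described in Proposition 6 and Corollary 5 can be sketched in a few lines. The following is an illustrative sketch rather than the paper's own implementation: the function names are hypothetical, words are modelled as Python strings, and the presentation is assumed to be given as a list of distinct relation words. It tracks the largest and second-largest match lengths, as in the proof of Proposition 6, and rejects a relation word when $|X_R| + |Z_R| \ge |R|$ or when its middle word is a piece.

```python
def lcp_at(w, r, i):
    """Length of the longest common prefix of w and the suffix r[i:]."""
    n = 0
    while n < len(w) and i + n < len(r) and w[n] == r[i + n]:
        n += 1
    return n


def max_piece_prefix(w, relation_words):
    """Longest prefix of w occurring at two or more positions in the relation words."""
    best = second = 0  # largest match length, and largest attained at least twice
    for r in relation_words:
        for i in range(len(r)):
            n = lcp_at(w, r, i)
            if n > best:
                best, second = n, best
            elif n > second:
                second = n
    return w[:second]


def max_piece_suffix(w, relation_words):
    """Dual computation: reverse every word and reuse the prefix routine."""
    reversed_words = [r[::-1] for r in relation_words]
    return max_piece_prefix(w[::-1], reversed_words)[::-1]


def is_piece(w, relation_words):
    return max_piece_prefix(w, relation_words) == w


def satisfies_c4(relation_words):
    """Test C(4): no relation word is a product of three or fewer pieces."""
    for r in relation_words:
        x = max_piece_prefix(r, relation_words)
        z = max_piece_suffix(r, relation_words)
        if len(x) + len(z) >= len(r):
            return False  # r is a product of two pieces, so even C(3) fails
        middle = r[len(x):len(r) - len(z)]
        if is_piece(middle, relation_words):
            return False  # r = x * middle * z is a product of three pieces
    return True
```

As a small check, the hypothetical relation words `pqrst` and `pvwxt` share only the pieces `p` and `t`, and `satisfies_c4(["pqrst", "pvwxt"])` accepts them, whereas `abc` and `adbec` fail because the middle word `b` of `abc` is a piece.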

Theorem 2. There is a RAM algorithm which, given as input a C(4)
presentation $\langle \mathcal{A} \mid \mathcal{R} \rangle$ and two words $u, v \in \mathcal{A}^*$, decides whether $u$ and $v$ represent
the same element of the semigroup presented in time

$$O\big(|\mathcal{R}|^2 \min(|u|, |v|)\big).$$

Proof. Suppose we are given a C(4) presentation $\langle \mathcal{A} \mid \mathcal{R} \rangle$ and two words
$u, v \in \mathcal{A}^*$. Just as in the proof of Proposition 6, the machine begins by
finding for every relation word $R$ the maximum piece prefix $X_R$, the maximum
piece suffix $Z_R$ and the middle word $Y_R$, in time $O(|\mathcal{R}|^2)$.

It now has the information required to apply the algorithm WP-PREFIX
given above. A simple line-by-line analysis shows that each line, and hence
each recursive call, can be executed in time $O(|\mathcal{R}|)$. By Lemma 6 the
number of recursive calls is bounded above by $(k + 2)|u| + 1$ where $k$, being
the length of the longest maximum piece suffix of a relation word, is less
than $|\mathcal{R}|$. Thus, this part of the algorithm terminates in time $O(|\mathcal{R}|^2|u|)$.

As above we may assume, by exchanging $u$ and $v$ at the start of the
computation if necessary, that $|u| \le |v|$ so that $\min(|u|, |v|) = |u|$. It follows
that the uniform word problem can be solved in time $O(|\mathcal{R}|^2 \min(|u|, |v|))$
as claimed. □